Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move the Service controller to use the new controller model. #1208

Merged
merged 1 commit into from
Jun 14, 2018

Conversation

mattmoor
Copy link
Member

Fixes: #1134

@google-prow-robot google-prow-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jun 14, 2018
@knative-metrics-robot
Copy link

The following is the coverage report on pkg/. Say /test pull-knative-serving-go-coverage to run the coverage report again

File Old Coverage New Coverage Delta
pkg/apis/serving/v1alpha1/service_types.go 87.5% 47.5% -40.0
pkg/apis/serving/v1alpha1/configuration_types.go 97.7% 97.7% 0.1
pkg/apis/serving/v1alpha1/configuration_types_test.go 97.7% 97.7% 0.1
pkg/apis/serving/v1alpha1/revision_types.go 96.8% 98.4% 1.6
pkg/controller/configuration/configuration.go 78.3% 75.8% -2.5
pkg/controller/configuration/configuration_test.go 78.3% 75.8% -2.5

*TestCoverage feature is being tested, do not rely on any info here yet

@knative-metrics-robot
Copy link

The following is the coverage report on pkg/. Say /test pull-knative-serving-go-coverage to run the coverage report again

File Old Coverage New Coverage Delta
pkg/apis/serving/v1alpha1/configuration_types.go 97.7% 97.7% 0.1
pkg/apis/serving/v1alpha1/configuration_types_test.go 97.7% 97.7% 0.1
pkg/apis/serving/v1alpha1/revision_types.go 96.8% 98.4% 1.6
pkg/apis/serving/v1alpha1/service_types.go 87.5% 98.3% 10.8
pkg/apis/serving/v1alpha1/service_types_test.go 87.5% 98.3% 10.8
pkg/controller/configuration/configuration.go 78.3% 78.9% 0.7
pkg/controller/configuration/configuration_test.go 78.3% 78.9% 0.7

*TestCoverage feature is being tested, do not rely on any info here yet

@google-prow-robot google-prow-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jun 14, 2018
@knative-metrics-robot
Copy link

The following is the coverage report on pkg/. Say /test pull-knative-serving-go-coverage to run the coverage report again

File Old Coverage New Coverage Delta
pkg/apis/serving/v1alpha1/revision_types.go 96.8% 98.4% 1.6
pkg/apis/serving/v1alpha1/service_types.go 87.5% 98.3% 10.8
pkg/apis/serving/v1alpha1/service_types_test.go 87.5% 98.3% 10.8

*TestCoverage feature is being tested, do not rely on any info here yet

This moves the service controller to the more level-based model started in #1202.  This also makes the service controller start to listen to Route and Configuration events to trigger reconciliation, so that we can reflect a basic overall readiness condition based on the combination of these two sub-resources readiness conditions (I don't expect this will be the final model, but is a start).

Fixes: knative/serving#1134
@knative-metrics-robot
Copy link

The following is the coverage report on pkg/. Say /test pull-knative-serving-go-coverage to run the coverage report again

File Old Coverage New Coverage Delta
pkg/apis/serving/v1alpha1/service_types.go 87.5% 98.3% 10.8
pkg/apis/serving/v1alpha1/service_types_test.go 87.5% 98.3% 10.8
pkg/apis/serving/v1alpha1/revision_types.go 96.8% 98.4% 1.6

*TestCoverage feature is being tested, do not rely on any info here yet

@mattmoor
Copy link
Member Author

I'm going to remove the WIP since I think this is incremental progress. The service controller parts of this lack coverage, but partly because the entire service controller is uncovered.

I'm going to start working on that in a fresh commit / PR based on this.

/assign @vaikas-google
/assign @jonjohnsonjr
/assign @grantr

@mattmoor mattmoor changed the title WIP Move the Service controller to use the new controller model. Move the Service controller to use the new controller model. Jun 14, 2018
@google-prow-robot google-prow-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 14, 2018
@jonjohnsonjr
Copy link
Contributor

/lgtm
/approve

@google-prow-robot google-prow-robot added the lgtm Indicates that a PR is ready to be merged. label Jun 14, 2018
@google-prow-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jonjohnsonjr, mattmoor

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-prow-robot google-prow-robot merged commit 07db2f6 into knative:master Jun 14, 2018
@mattmoor mattmoor deleted the steve-time branch June 14, 2018 16:03
@@ -183,7 +189,8 @@ func (ss *ServiceStatus) RemoveCondition(t ServiceConditionType) {
}

func (ss *ServiceStatus) InitializeConditions() {
sct := []ServiceConditionType{ServiceConditionReady}
sct := []ServiceConditionType{ServiceConditionReady,
ServiceConditionConfigurationReady, ServiceConditionRouteReady}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can each of these be on a new line - for when we add new conditions in the future

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will push a change to normalize to that style across *_types.go

@@ -193,3 +200,67 @@ func (ss *ServiceStatus) InitializeConditions() {
}
}
}

func (ss *ServiceStatus) PropagateConfiguration(cs ConfigurationStatus) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer PropagateConfigurationStatus

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will push a change with this rename.

}
}

func (ss *ServiceStatus) PropagateRoute(rs RouteStatus) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer PropagateRouteStatus

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will push a change with this rename.

checkConditionFailedService(svc.Status, ServiceConditionRouteReady, t)

// Fixed the glitch.
svc.Status.PropagateRoute(RouteStatus{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest more granularity and test the scenarios below. Let me know if it's too pedantic. I'm suggesting this because there's no test for when a configuration fails and recovers. And there's some overlap testing the route failure flow in TestTypicalServiceFlow and TestRouteFailurePropagation

Each bullet point would be a scenario to test given the conditions when ...

service happy flow

when the service ready condition is unknown

  • route ready condition becomes true
  • configuration ready condition becomes true
  • both route and configuration ready condition becomes true

failure propagation

when the service ready condition is true
and both route & configuration ready conditions are true

  • route ready condition becomes false
  • configuration ready condition becomes false

configuration recovery

when the service ready condition is false
and route ready condition is true
and configuration ready condition is false

  • configuration ready condition becomes true

route recovery

when the service ready condition is false
and route ready condition is false
and configuration ready condition is true

  • route ready condition becomes true

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened #1224 to track this. This is a great list, thanks.

@@ -65,12 +58,17 @@ func NewController(

// obtain references to a shared index informer for the Services.
informer := elaInformerFactory.Serving().V1alpha1().Services()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we name this serviceInformer for clarity

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushing change


informers := []cache.SharedIndexInformer{informer.Informer()}
informers := []cache.SharedIndexInformer{informer.Informer(),
configurationInformer.Informer(), routeInformer.Informer()}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multiple lines

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushing a change.

// lister indexes properties about Services
lister listers.ServiceLister
// listers index properties about resources
lister listers.ServiceLister
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

serviceLister

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushing change.

// TODO(#642): Remove this (needed to avoid continuous updates)
desiredConfig.Spec.Generation = config.Spec.Generation

if diff := cmp.Diff(desiredConfig.Spec, config.Spec); diff == "" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use https://github.com/kubernetes/apimachinery/blob/master/pkg/api/equality/semantic.go to make sure they are semantically the same

One reason being cmp.Diff doesn't seem to equate empty and nil maps as equal. Hence the need for these cmp options https://godoc.org/github.com/google/go-cmp/cmp/cmpopts

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, though IMO we should continue using cmp.Diff in tests.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left the cmp.Diff for logging to the controller logs, but switched the actual comparison the equality.Semantic.DeepEqual. Will add this to my change incorporating feedback.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@grantr I agree tests should use cmp.Diff

// TODO(#642): Remove this (needed to avoid continuous updates)
desiredRoute.Spec.Generation = route.Spec.Generation

if diff := cmp.Diff(desiredRoute.Spec, route.Spec); diff == "" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

avoid cmp.Diff as mentioned above

Copy link
Contributor

@grantr grantr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mostly agree with @dprotaso's comments plus I have a few of my own. @mattmoor you may have already addressed these in other PRs that I haven't seen; feel free to ignore if so. 😺

@@ -183,7 +189,8 @@ func (ss *ServiceStatus) RemoveCondition(t ServiceConditionType) {
}

func (ss *ServiceStatus) InitializeConditions() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not opposed to this solution, but seems like defaulter-gen might be cleaner. If only defaulter-gen were documented...

} else if err != nil {
logger.Errorf("Failed to reconcile Service: %q failed to Get Configuration: %q; %v", service.Name, configName, zap.Error(err))
return err
} else if config, err = c.reconcileConfiguration(service, config); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This appears to skip reconciling the configuration when it's just been created. I'm curious if there's a reason not to reconcile the configuration immediately. The informer cache won't have it, but that's likely not a problem because we already have the latest state as returned by Create.

It's true that the service will be re-added to the workqueue almost immediately (depending on latency of the configuration informer), but if the workqueue is long there might be a noticeable delay before it comes back.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that all this would end up reconciling is changes made by mutating webhooks (or the API Server). It's a big problem if webhooks make changes that disagree with our reconciliation (we'll flip-flop every resync), so I'm not sure I see the value in this. I may be naive here, but what would we miss by not doing this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern is that I don't see any documentation or test that createConfiguration, reconcileConfiguration, and the webhook must all result in exactly the same object, but the code relies on that being true in subtle ways. It would be easy to change only one of the create or reconcile methods without changing the other, since things would still appear to work in practice (because of the immediate second reconcile).

Probably the easiest solution for this (minus the webhook issue) is to add a unit test that diffs the output of createConfiguration and reconcileConfiguration.

An alternative is to allow reconcileConfiguration to take a nil Configuration:

func (c *Controller) reconcileConfiguration(service *v1alpha1.Service, config *v1alpha1.Configuration) (*v1alpha1.Configuration, error) {
  desiredConfig := MakeServiceConfiguration(service)
  if config == nil {
    return c.ElaClientSet.ServingV1alpha1().Configurations(service.Namespace).Create(config)
  else {
    desiredConfig.Spec.Generation = config.Spec.Generation
    if equality.Semantic.DeepEqual(desiredConfig.Spec, config.Spec) != {
      config.Spec = desiredConfig.Spec
      return c.ElaClientSet.ServingV1alpha1().Configurations(service.Namespace).Update(config)
    }
  }
  return config, nil
}

The webhook issue is trickier to solve fully, but could at least be mitigated using defaulter-gen (because the generated defaulters would be defined in one place and used by both webhook and controller).

} else if err != nil {
logger.Errorf("Failed to reconcile Service: %q failed to Get Route: %q", service.Name, routeName)
return err
} else if route, err = c.reconcileRoute(service, route); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question as above re: not reconciling the route immediately.

// TODO(#642): Remove this (needed to avoid continuous updates)
desiredConfig.Spec.Generation = config.Spec.Generation

if diff := cmp.Diff(desiredConfig.Spec, config.Spec); diff == "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, though IMO we should continue using cmp.Diff in tests.

// Preserve the rest of the object (e.g. ObjectMeta)
config.Spec = desiredConfig.Spec
configClient := c.ElaClientSet.ServingV1alpha1().Configurations(service.Namespace)
return configClient.Update(config)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd collapse these into one line without setting configClient, but I'm fine with this approach too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like I already did this in configuration.go, will add this to the change I'm pushing.

func (c *Controller) createConfiguration(service *v1alpha1.Service) (*v1alpha1.Configuration, error) {
configClient := c.ElaClientSet.ServingV1alpha1().Configurations(service.Namespace)
return configClient.Create(MakeServiceConfiguration(service))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function (and createRoute below it) do so little I wonder if it might be clearer to inline them into the reconcile.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I have a mild preference for this slightly more readable method name for readability of the core reconciliation method, even if it is now one line :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants